

Riemannian Gradient Descent





Distributed Principal Component Analysis with Limited Communication

Neural Information Processing Systems

However, in many settings the large volume of data requires processing by multiple nodes, which has inspired significant work on reducing the distribution overheads of PCA, e.g.
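The computation being distributed here is classical PCA. As a minimal single-node illustration (not the communication-limited algorithm from the paper), the leading principal component can be found by Riemannian gradient ascent on the unit sphere; the step size and iteration count below are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((200, 5))
C = A.T @ A / 200  # sample covariance (5 x 5)

# Maximize the Rayleigh quotient x^T C x over the unit sphere.
x = rng.standard_normal(5)
x /= np.linalg.norm(x)
eta = 0.5
for _ in range(500):
    g = 2 * C @ x               # Euclidean gradient
    rg = g - (x @ g) * x        # project onto the tangent space at x
    x = x + eta * rg            # ascent step
    x /= np.linalg.norm(x)      # retraction: renormalize onto the sphere

# Compare against the top eigenvector from a dense eigensolver.
w, V = np.linalg.eigh(C)
v = V[:, -1]
print(abs(x @ v))  # close to 1.0 (up to sign)
```

The projection-then-renormalize pair is the sphere's simplest retraction; distributed PCA methods reduce how much of this per-iteration work must be communicated between nodes.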



Landing with the Score: Riemannian Optimization through Denoising

Kharitenko, Andrey, Shen, Zebang, de Santi, Riccardo, He, Niao, Doerfler, Florian

arXiv.org Machine Learning

Under the data manifold hypothesis, high-dimensional data are concentrated near a low-dimensional manifold. We study the problem of Riemannian optimization over such manifolds when they are given only implicitly through the data distribution, and the standard manifold operations required by classical algorithms are unavailable. This formulation captures a broad class of data-driven design problems that are central to modern generative AI. Our key idea is to introduce a link function that connects the data distribution to the geometric operations needed for optimization. We show that this function enables the recovery of essential manifold operations, such as retraction and Riemannian gradient computation. Moreover, we establish a direct connection between our construction and the score function in diffusion models of the data distribution. This connection allows us to leverage well-studied parameterizations, efficient training procedures, and even pretrained score networks from the diffusion model literature to perform optimization. Building on this foundation, we propose two efficient inference-time algorithms -- Denoising Landing Flow (DLF) and Denoising Riemannian Gradient Descent (DRGD) -- and provide theoretical guarantees for both feasibility (approximate manifold adherence) and optimality (small Riemannian gradient norm). Finally, we demonstrate the effectiveness of our approach on finite-horizon reference tracking tasks in data-driven control, highlighting its potential for practical generative and design applications.
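The attraction mechanism can be sketched in a toy setting where the "data manifold" is the unit circle and the score of a Gaussian-smoothed data distribution points toward it. Everything below is a hypothetical illustration of the idea, not the authors' DLF/DRGD implementation: the closed-form `score`, the objective f(x) = x_0, and all constants are assumptions for the toy:

```python
import numpy as np

sigma = 0.05  # smoothing scale of the (hypothetical) noised data distribution

def score(x):
    # Approximate score of a Gaussian-smoothed uniform distribution on the
    # unit circle: it points from x toward the circle (toy assumption).
    r = np.linalg.norm(x)
    return ((1.0 - r) / sigma**2) * (x / r)

def grad_f(x):
    return np.array([1.0, 0.0])  # f(x) = x_0; minimizer on the circle is (-1, 0)

x = np.array([0.3, 1.2])  # start off the manifold
eta, lam = 0.05, 1.0
for _ in range(2000):
    n = x / np.linalg.norm(x)        # estimated unit normal (score direction)
    g = grad_f(x)
    tangential = g - (g @ n) * n     # Riemannian gradient estimate
    landing = sigma**2 * score(x)    # rescaled score: attraction to the manifold
    x = x - eta * tangential + eta * lam * landing

print(x)  # near (-1, 0): feasible and first-order optimal
```

The point of the sketch is the split of each update into a tangential objective-decreasing part and a score-following part; in the paper's setting the score comes from a learned or pretrained diffusion network rather than a closed form.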

  manifold, optimization, riemannian gradient descent, (13 more...)
arXiv:2509.23357

The Burer-Monteiro SDP method can fail even above the Barvinok-Pataki bound

Neural Information Processing Systems

This motivates the following question: When does the Burer-Monteiro method converge to a globally optimal solution? This question has attracted much recent interest on the theoretical front.
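The Burer-Monteiro method replaces the SDP variable X ⪰ 0 with a low-rank factorization X = YYᵀ and optimizes over Y directly. A minimal sketch for the max-cut relaxation (a standard example problem, chosen here as an assumption), where the unit-diagonal constraint turns into unit-norm rows of Y:

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 20, 6  # p chosen near the Barvinok-Pataki scale ~ sqrt(2n)
W = rng.standard_normal((n, n))
C = (W + W.T) / 2  # symmetric cost matrix

# Max-cut SDP: max <C, X> s.t. X >= 0, diag(X) = 1.
# Burer-Monteiro: X = Y Y^T with unit-norm rows of Y (oblique manifold).
Y = rng.standard_normal((n, p))
Y /= np.linalg.norm(Y, axis=1, keepdims=True)
obj0 = np.trace(C @ (Y @ Y.T))

eta = 0.01
for _ in range(3000):
    G = 2 * C @ Y                                        # Euclidean gradient of <C, YY^T>
    R = G - np.sum(G * Y, axis=1, keepdims=True) * Y     # row-wise tangent projection
    Y = Y + eta * R                                      # ascent step
    Y /= np.linalg.norm(Y, axis=1, keepdims=True)        # retract rows to the unit sphere

X = Y @ Y.T
print(np.trace(C @ X))  # improved objective; diag(X) == 1 by construction
```

The cited paper's point is that this nonconvex iteration can get stuck at non-global critical points even when p exceeds the Barvinok-Pataki bound, so convergence to a *global* optimum is not guaranteed by the rank choice alone.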


Fast and Provable Tensor-Train Format Tensor Completion via Preconditioned Riemannian Gradient Descent

Bian, Fengmiao, Cai, Jian-Feng, Zhang, Xiaoqun, Zhang, Yuanwei

arXiv.org Artificial Intelligence

Low-rank tensor completion aims to recover a tensor from partially observed entries, and it is widely applicable in fields such as quantum computing and image processing. Due to the significant advantages of the tensor train (TT) format in handling structured high-order tensors, this paper investigates the low-rank tensor completion problem based on the TT format. We propose a preconditioned Riemannian gradient descent algorithm (PRGD) to solve low TT-rank tensor completion and establish its linear convergence. Experimental results on both simulated and real datasets demonstrate the effectiveness of the PRGD algorithm. On the simulated dataset, the PRGD algorithm reduced the computation time by two orders of magnitude compared to existing classical algorithms. In practical applications such as hyperspectral image completion and quantum state tomography, the PRGD algorithm significantly reduced the number of iterations, thereby substantially reducing the computational time.
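TT-format machinery aside, the core iteration is easiest to see in the matrix analogue: Riemannian gradient descent on the fixed-rank manifold with a truncated-SVD retraction. This is a simplified sketch of plain RGD for rank-r matrix completion (with an exact line-search step size), not the paper's TT-format preconditioner:

```python
import numpy as np

rng = np.random.default_rng(2)
m, n, r = 30, 30, 2
M = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))  # ground truth, rank r
mask = rng.random((m, n)) < 0.5                                # observed entries

def tangent_project(G, U, Vt):
    # Project G onto the tangent space of the rank-r manifold at X = U S Vt.
    return U @ (U.T @ G) + (G @ Vt.T) @ Vt - U @ (U.T @ G @ Vt.T) @ Vt

# Spectral-style initialization from the observed entries.
U, s, Vt = np.linalg.svd(mask * M, full_matrices=False)
U, s, Vt = U[:, :r], s[:r], Vt[:r]
X = (U * s) @ Vt

for _ in range(300):
    G = mask * (X - M)                       # gradient of 0.5 * ||P_Omega(X - M)||^2
    xi = tangent_project(G, U, Vt)           # Riemannian gradient
    t = np.linalg.norm(xi) ** 2 / (np.linalg.norm(mask * xi) ** 2 + 1e-12)
    U, s, Vt = np.linalg.svd(X - t * xi, full_matrices=False)  # SVD retraction
    U, s, Vt = U[:, :r], s[:r], Vt[:r]
    X = (U * s) @ Vt

print(np.linalg.norm(X - M) / np.linalg.norm(M))  # small relative error
```

A preconditioner of the kind the paper proposes rescales the gradient to compensate for the sampling operator, which is what buys the reported speedups over the plain iteration above.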


Tensor train completion: local recovery guarantees via Riemannian optimization

Budzinskiy, Stanislav, Zamarashkin, Nikolai

arXiv.org Artificial Intelligence

The problem of recovering algebraically structured data from scarce measurements has already become a classic one. The data under consideration are typically sparse vectors or low-rank matrices and tensors, while the measurements are obtained by applying a linear operator that satisfies a variant of the so-called restricted isometry property (RIP) [1]. In this work, we focus on tensor completion, which consists in recovering a tensor in the tensor train (TT) format [2, 3] from a small subset of its entries. Specifically, we consider it as a Riemannian optimization problem [4, 5] on the smooth manifold of tensors with fixed TT ranks and derive sufficient conditions (essentially, the RIP) for local convergence of the Riemannian gradient descent. We further estimate the number of randomly selected entries of a tensor with low TT ranks that is sufficient for the RIP to hold with high probability and, as a consequence, for the Riemannian gradient descent to converge locally.


Infeasible Deterministic, Stochastic, and Variance-Reduction Algorithms for Optimization under Orthogonality Constraints

Ablin, Pierre, Vary, Simon, Gao, Bin, Absil, P. -A.

arXiv.org Artificial Intelligence

Orthogonality constraints naturally appear in many machine learning problems, from Principal Components Analysis to robust neural network training. They are usually solved using Riemannian optimization algorithms, which minimize the objective function while enforcing the constraint. However, enforcing the orthogonality constraint can be the most time-consuming operation in such algorithms. Recently, Ablin and Peyré (2022) proposed the Landing algorithm, a method with cheap iterations that does not enforce the orthogonality constraint but is attracted towards the manifold in a smooth manner. In this article, we provide new practical and theoretical developments for the landing algorithm. First, the method is extended to the Stiefel manifold, the set of rectangular orthogonal matrices. We also consider stochastic and variance reduction algorithms when the cost function is an average of many functions. We demonstrate that all these methods have the same rate of convergence as their Riemannian counterparts that exactly enforce the constraint. Finally, our experiments demonstrate the promise of our approach on an array of machine-learning problems that involve orthogonality constraints.
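The landing update is concrete enough to state: instead of retracting, each step combines a skew-symmetric relative-gradient term with a penalty field λ(XXᵀ − I)X that smoothly pulls the iterate back toward the manifold. A minimal sketch for the square orthogonal case, with an illustrative objective (nearest orthogonal matrix) and step sizes chosen here as assumptions:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 8
A = rng.standard_normal((n, n))

def grad_f(X):
    return X - A  # f(X) = 0.5 * ||X - A||_F^2: nearest-orthogonal-matrix objective

X = np.eye(n)
eta, lam = 0.05, 1.0
for _ in range(2000):
    G = grad_f(X)
    skew = 0.5 * (G @ X.T - X @ G.T)  # relative (skew-symmetric) gradient part
    landing = X @ X.T @ X - X          # penalty field (X X^T - I) X
    X = X - eta * (skew @ X + lam * landing)

# X ends up (nearly) orthogonal without any retraction or QR/SVD per step.
print(np.linalg.norm(X @ X.T - np.eye(n)))
```

The appeal is that each iteration costs only matrix multiplications: the two terms vanish together exactly at orthogonal critical points, so no projection back onto the manifold is ever computed.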